Abstract. We construct nonlinear extensions of Dirac's relativistic electron equation that
preserve its other desirable properties such as locality, separability, conservation of
probability and Poincaré invariance. We determine the constraints that the nonlinear term must
obey and classify the resultant non-polynomial nonlinearities in a double expansion in the
degree of nonlinearity and number of derivatives. We give explicit examples of such nonlinear
equations, studying their discrete symmetries and other properties. Motivated by some
previously suggested applications we then consider nonlinear terms that simultaneously violate
Lorentz covariance and again study various explicit examples. We contrast our equations
and construction procedure with others in the literature and also show that our equations
are not gauge equivalent to the linear Dirac equation. Finally we outline various physical
applications for these equations.

Key words: nonlinear Dirac equation; Lorentz violation 



2000 Mathematics Subject Classification: 81P05; 81Q99; 83A05 



1 Introduction

When Schrödinger obtained his wave equation to realise de Broglie's speculation about the wave
nature of particles, he used a number of heuristic arguments and assumed the simplest possibility,
that of linearity of the equation [1]. Fortunately that assumption led to very good agreement with
experiment, and to this day no deviations from quantum linearity have been detected even though
a few low energy experiments have attempted to observe them [2, 3, 4, 5]. Currently the main
interest in nonlinear Schrödinger equations is that they appear, in form, as approximations in
optics and condensed matter [6, 7].

When Dirac generalised Schrödinger's equation to the relativistic domain, he too kept linearity.
Nonlinear versions of Dirac's equation have been studied for various purposes since then.
Heisenberg's proposal was in the context of field theory and was motivated by the question of
mass. In the quantum mechanical context, nonlinear Dirac equations have been used as effective
theories in atomic, nuclear and gravitational physics [9, 10, 11]. Some of the simpler versions
have been analysed rigorously [12].

Although there is as yet no evidence for fundamental quantum nonlinearities, their absence
is seen as a puzzle by several authors and requires an understanding [13, 14, 15, 16, 17]. Based
on an extrapolation of some information theoretic arguments at the non-relativistic level, it was 
proposed in [18] that perhaps quantum linearity might be intimately tied to Lorentz invariance 
and that the possible violation of the latter at a fundamental level might lead to quantum 
nonlinearities. If true, then perhaps the appropriate regime to seek such inter-related violations 
would be at high energies or at very short distances. 

Since quantum nonlinearities, if they exist, must be very small, the best place to search for 
them is where they might show up at leading order, not masked by other corrections. Thus one 
hopes to detect the nonlinearities at the quantum mechanical level, rather than as supplements to
loop effects in field theory. Neutrinos are therefore an ideal probe of such potential nonlinearities
as they are weakly interacting and so not affected much by field theory corrections. 

Indeed, neutrino oscillations were suggested in [18] as one place where quantum nonlinearities
might be relevant and a heuristic study was conducted using a provisional nonlinear Dirac
equation. That equation was very complicated and it did not conserve probability.

In this paper we discuss nonlinear Dirac equations, at the quantum mechanical level, which
preserve all the other desirable features such as conservation of probability. We intend to use
these equations to study not just neutrino oscillations but also various other high-energy
phenomena which are briefly discussed in the last section.

However it is possible that our equations might also be relevant as approximate equations, 
for use either in particle physics or condensed matter physics, and we discuss this also in the 
concluding section. 

As there are various obstacles to generalising the non-relativistic information theory approach 
of [H] to the relativistic domain, we proceed in a different manner here. We write the nonlinear 
equation as

\[ (i\gamma^\mu\partial_\mu - m + F)\psi = 0, \tag{1.1} \]

where $F$ is a function of the wavefunction $\psi$, its adjoint $\bar\psi$ and their derivatives^1. We begin
by requiring, just as for $F = 0$, that equation (1.1) be local, Poincaré covariant, conserve
probability and be separable for multi-particle states. The constraints on $F$ are then solved in an
expansion procedure to be detailed in Section 2.5.1. That is, we implement a systematic scheme
to construct a large class of nonlinear extensions of the Dirac equation.

^1 But we do not consider $F$'s that have free derivatives acting to the right on the final $\psi$ of the
equation (1.1). So our nonlinearity is a matrix in spinor space with spacetime dependent coefficients.

The constraints we adopt are similar to those used in understanding non-relativistic quantum
theory in [19, 20]. There it was deduced that the Schrödinger equation is the unique single
universal parameter ($\hbar$) extension of classical ensemble dynamics. Although the speed of light, $c$,
is a universal parameter for relativistic dynamics, it already appears at the classical level and
plays the role of converting the dimensions of space to those of time. One expects that further
extensions of quantum theory either at the non-relativistic or relativistic level would involve
other universal parameters, for example a universal length.

Our approach and most of our results differ from previous constructs of nonlinear Dirac
equations in the literature. Most studies [9, 10, 11] do not impose separability, which is a strong
constraint that leads to non-polynomiality of $F$. In [21] separability was imposed in a somewhat
different manner from what we do here, but more importantly the authors of [21] only considered
nonlinear Dirac equations that are obtained from the linear Dirac equation through a process
of gauge-completion: thus their class of equations is more restrictive than ours. Some further
contrasts of our procedure compared to others [22, 23] are that we allow derivatives of the
wavefunction in $F$, and also study nonlinearities which violate one or more of the discrete P, C, T
symmetries, as such cases are expected to be phenomenologically relevant.

Furthermore, proceeding with the suggestion of [18], we also construct versions of (1.1) that
are simultaneously Lorentz violating and nonlinear: such equations have also not been studied
before in general; however we note that one example of such an equation, without derivatives in
the nonlinearity, has been studied in [24, 25], motivated by anisotropic space-times [26].

The rest of the paper is structured as follows: In Section 2 we discuss and make explicit
the various constraints on the nonlinear term $F$; we note that the class of nonlinearities we
consider can also be motivated without imposing separability and so are potentially useful also
as effective equations at low energies. The simplest examples of such equations are discussed in
Section 3, followed by their plane-wave solutions and the corresponding dispersion relations in
Section 4. In Section 5 we study examples of $F$ that simultaneously violate Lorentz covariance.
In Section 6 we illustrate more complicated examples of the nonlinear equations and also discuss
the alternative approach whereby the nonlinear equations are obtained from a Lagrangian. In
Section 7 we explain how to distinguish our nonlinearities from those that may be obtained from
the linear equation through a nonlinear gauge transformation. A summary and outlook is in
Section 8.

Although the evolution equation (1.1) has been modified, we keep the usual kinematical
structure of quantum mechanics; some arguments, that fundamental nonlinear quantum theories
are intrinsically pathological, are discussed in the final section. The conventions we use are
similar to those in the textbook [27]; unless stated, our discussion is representation independent.
Although we work in 3 + 1 dimensional flat spacetime with metric $g^{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$, some
effects of gravity could possibly be encoded in an effective nonlinearity; we do not study in
this paper explicit couplings to gravity though this might yield some interesting consequences
as seen for the linear Dirac equation [28].

2 Constraints 

The usual, linear, quantum-mechanical ("first quantised") Dirac equation has many appealing
properties which we will mostly preserve so as to achieve a minimal deformation. Later in
Sections 5 and 6 we discuss the possibility of further extensions motivated by physical
considerations.

We now list and explain the various constraints that we are going to impose on the nonlinear
Dirac equation and hence on $F$ in (1.1).

2.1 Locality 

We continue to assume that physics, as described by the wavefunction $\psi$, is accurately captured
by a local evolution equation: that is, we require $F$ to depend only on $\psi$, its conjugate and their
derivatives, all evaluated at a single point $x$. Note that $F$ below is in general a matrix in spinor
space though later we will specialise to various cases, such as $F$ proportional to the identity
matrix.

Notice that we demand locality of the equations of motion rather than of a Lagrangian.
This means that some of our equations might not be obtainable from a local Lagrangian. One
could of course implement a construction procedure similar to that described below at the
local Lagrangian level: we illustrate this in Section 6 and discuss the relative advantages and
disadvantages.

2.2 Poincaré invariance

Under the Poincaré transformation $x' = \Lambda x + a$ the linear Dirac equation is covariant if the
wavefunction transforms as

\[ \psi'(x') = S(\Lambda)\psi(x) = S(\Lambda)\psi\big(\Lambda^{-1}(x' - a)\big), \]

where $S^{-1}(\Lambda)\gamma^\mu \Lambda_\mu{}^{\nu} S(\Lambda) = \gamma^\nu$. Explicitly we have $S(\Lambda) = \exp(-\tfrac{i}{4}\sigma_{\alpha\beta}\omega^{\alpha\beta})$, with $\omega^{\alpha\beta}$ the
transformation parameters. If we demand that the nonlinear equation (1.1) be covariant under
the same transformations then we obtain the following constraint,

\[ S^{-1}(\Lambda) F' S(\Lambda) = F, \]

where $F'$ is the Poincaré transformed $F$; recall that $F$ is a function depending on $\psi$, $\bar\psi$ and their
derivatives.
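
The covariance statement above can be probed numerically. The following is a minimal sketch, assuming the Dirac-Pauli representation, $\sigma^{\alpha\beta} = \tfrac{i}{2}[\gamma^\alpha,\gamma^\beta]$ and arbitrarily chosen (hypothetical) parameters $\omega^{\alpha\beta}$; the helper names are ours. It does not fix the sign convention relating $\omega$ to $\Lambda$: it only reads off the matrix $\Lambda$ induced by $S$ on the $\gamma^\mu$ and checks that it preserves the metric.

```python
import numpy as np

# Dirac-Pauli representation of the gamma matrices (an assumed, standard choice).
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gam = [np.kron(np.diag([1, -1]), np.eye(2)).astype(complex)]
gam += [np.kron(np.array([[0, 1], [-1, 0]]), s) for s in pauli]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def sigma(a, b):
    """sigma^{ab} = (i/2) [gamma^a, gamma^b]."""
    return 0.5j * (gam[a] @ gam[b] - gam[b] @ gam[a])

def expm(X, terms=60):
    """Matrix exponential by a truncated Taylor series (adequate for small X)."""
    out = np.eye(4, dtype=complex)
    term = np.eye(4, dtype=complex)
    for n in range(1, terms):
        term = term @ X / n
        out = out + term
    return out

# Hypothetical antisymmetric parameters omega^{ab}: a boost mixed with rotations.
omega = np.zeros((4, 4))
omega[0, 1], omega[1, 2], omega[2, 3] = 0.3, -0.2, 0.5
omega = omega - omega.T

# sigma_{ab} omega^{ab}, with indices lowered by the diagonal metric.
gen = sum(metric[a, a] * metric[b, b] * sigma(a, b) * omega[a, b]
          for a in range(4) for b in range(4))
S, S_inv = expm(-0.25j * gen), expm(0.25j * gen)

# Read off Lambda^mu_nu = (1/4) Tr(S^{-1} gamma^mu S gamma_nu).
Lam = np.array([[0.25 * np.trace(S_inv @ gam[mu] @ S @ gam[nu]) * metric[nu, nu]
                 for nu in range(4)] for mu in range(4)])
```

In this sketch `Lam` comes out real and satisfies `Lam @ metric @ Lam.T == metric` up to rounding, so the similarity transformation by `S` acts on the $\gamma^\mu$ as a Lorentz transformation.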
2.3 Hermiticity

In quantum mechanics we usually require the Hamiltonian to be Hermitian so as to guarantee
reality of eigenvalues. Rewriting the nonlinear Dirac equation in Hamiltonian form we have

\[ i\partial_t\psi = (H_D - \beta F)\psi, \]

where $\beta = \gamma^0$ and $H_D$ is the linear Dirac Hamiltonian. Since $H_D^\dagger = H_D$, we also impose

\[ \gamma^0 F^\dagger \gamma^0 = F. \tag{2.1} \]



2.3.1 Current conservation 

In terms of the familiar adjoint $\bar\psi = \psi^\dagger\gamma^0$, the linear Dirac equation has the conserved current

\[ j^\mu = \bar\psi\gamma^\mu\psi, \tag{2.2} \]

which allows $\psi^\dagger\psi$ to be interpreted as a probability density. The divergence of the same
expression (2.2) in the nonlinear theory is

\[ \partial_\mu j^\mu = \bar\psi\,(iF - i\gamma^0 F^\dagger\gamma^0)\,\psi, \tag{2.3} \]

which vanishes due to the Hermiticity condition (2.1).

Thus requiring Hermiticity of the Hamiltonian also ensures conservation of (2.2). On the
other hand, in some future applications, we may want to consider non-Hermitian Hamiltonians
that model open systems. Then the right-hand-side of (2.3) can be used to measure leakage
from the system.
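
As an illustration of (2.1) and (2.3), here is a minimal numerical sketch, assuming the Dirac-Pauli representation; the test spinor, the two sample matrices `F_ok` and `F_leaky`, and the function names are hypothetical choices made only for this example.

```python
import numpy as np

g0 = np.kron(np.diag([1, -1]), np.eye(2)).astype(complex)            # gamma^0
g5 = np.kron(np.array([[0, 1], [1, 0]]), np.eye(2)).astype(complex)  # gamma_5 (Dirac-Pauli)

def hermiticity_defect(F):
    """F - gamma^0 F^dagger gamma^0, which vanishes when (2.1) holds."""
    return F - g0 @ F.conj().T @ g0

def current_source(psi, F):
    """Right-hand side of (2.3): psibar (iF - i gamma^0 F^dagger gamma^0) psi."""
    return 1j * (psi.conj() @ g0) @ hermiticity_defect(F) @ psi

psi = np.array([1 + 1j, 0.5, 0.2j, -0.1])   # arbitrary test spinor, psibar psi = 2.2
F_ok = 0.3 * np.eye(4) + 0.1j * g5          # satisfies (2.1): current is conserved
F_leaky = 0.3j * np.eye(4)                  # violates (2.1): nonzero source term
```

For `F_ok` the source vanishes identically, while for the anti-Hermitian `F_leaky` it equals $-0.6\,\bar\psi\psi$, a nonzero real number of the kind that could quantify leakage in an open-system model.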



2.3.2 Chiral current 

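For completeness we also discuss the chiral current, for which the expression in the linear theory
is $j_5^\mu = \bar\psi\gamma^\mu\gamma_5\psi$. Using the nonlinear equations of motion, we obtain

\[ \partial_\mu j_5^\mu = 2im\,\bar\psi\gamma_5\psi - i\bar\psi\,(\gamma_5 F + \gamma^0 F^\dagger\gamma^0\gamma_5)\,\psi. \]

For the usual chiral current to be conserved in the massless, $m \to 0$, limit of the nonlinear
equation, we require

\[ \gamma_5 F + \gamma^0 F^\dagger \gamma^0 \gamma_5 = 0, \]

which, on using the Hermiticity condition (2.1), simplifies to

\[ \{F, \gamma_5\} = 0. \]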



2.4 Universality 

The usual Dirac equation has the property, as all linear equations do, that it is invariant under
a rescaling of the wavefunction, $\psi \to \lambda\psi$. In quantum mechanics such a condition allows
solutions of the equation to be freely normalised, which is not only convenient but also sometimes
demanded for an interpretation of measurements [HI [T^ [IE] . 

We would like our nonlinear generalisation to preserve the same scale-invariance property,
which one may motivate with alternative reasoning as follows. We desire equations that are
as universal as possible. So, for example, the equation should have the same form whether it
describes a single particle or a system of particles. More specifically, the parameters describing
the strength of the nonlinearity $F$ should not be dependent on the number of particles in the
system, just as Planck's constant $\hbar$ is universal in the multiparticle Schrödinger equation.

^2 Recall, we are adopting the standard kinematical structure of quantum mechanics, in particular the standard
inner product. See also the first footnote.

If $\psi$ represents the wavefunction for an $N$-particle state, then the normalisation of probability
implies that the dimension of $\psi$ depends on $N$, just as in the non-relativistic case [19, 20], and
so the dimension of $F$ would then be $N$-dependent in general. We can avoid this conclusion by
requiring that $F$ have the above-mentioned scaling property

\[ F(\lambda\psi) = F(\psi), \tag{2.4} \]

where we mean that the wavefunction and its conjugate are all scaled by the same factor $\lambda$ on
the left-hand-side. Equation (2.4) implies that $F$ must be non-polynomial,

\[ F \sim F(A/B), \tag{2.5} \]

where $A$, $B$ have equal factors of the wavefunction.
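
A minimal numerical sketch of the scaling property (2.4), assuming the Dirac-Pauli representation and an illustrative ratio of bilinears $\bar\psi A\psi/\bar\psi\psi$; the matrix `A`, the spinor and the function names are hypothetical and chosen only to show the mechanism.

```python
import numpy as np

g0 = np.diag([1, 1, -1, -1]).astype(complex)   # gamma^0, Dirac-Pauli representation

def bar(psi):
    """Dirac adjoint psi^dagger gamma^0 as a row vector."""
    return psi.conj() @ g0

def ratio(psi, A):
    """Ratio of bilinears (psibar A psi) / (psibar psi): degree zero in psi."""
    return (bar(psi) @ A @ psi) / (bar(psi) @ psi)

def bilinear(psi):
    """Polynomial term psibar psi: rescales by |lambda|^2, so it is not invariant."""
    return bar(psi) @ psi

psi = np.array([1 + 1j, 0.5, 0.2j, -0.1])                  # arbitrary test spinor
A = np.arange(16).reshape(4, 4) + 1j * np.eye(4)           # arbitrary matrix
lam = 2.5 - 0.7j                                           # arbitrary complex scale

invariant = np.isclose(ratio(lam * psi, A), ratio(psi, A))
rescaled = bilinear(lam * psi) / bilinear(psi)             # equals |lam|^2
```

The ratio is unchanged because $\psi$ and $\bar\psi$ pick up $\lambda$ and $\lambda^*$ in both numerator and denominator, whereas any polynomial in the wavefunction rescales.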

2.5 Separability 

The usual Dirac equation may be used to describe a collection of particles and is separable for 
independent subsystems. It seems useful to have this separability property also for our nonlinear 
generalisation. However as we will explain in a later section, one may omit the separability 
constraint in favour of other arguments which result in similar forms for the eventual F's, and 
those forms anyway become separable with a suitable interpretation of the multiparticle states. 
Thus with the same structure for F we can use the equation for fundamental, phenomenological 
or effective dynamics. 

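Let us review separability first for the linear Dirac equation so as to motivate suitable
definitions of the adjoint and the current for many-body systems. In the multi-time formalism
[29, 30, 31], which preserves manifest Poincaré invariance, the many-body linear Dirac equation
for non-interacting particles may be written as

\[ \sum_s \big(i\gamma^\mu_s \partial_{\mu,s} - m_s\big)\psi = 0, \]

where

\[ \psi = \psi_1 \otimes \psi_2 \otimes \cdots \otimes \psi_s \otimes \cdots, \]
\[ \gamma^\mu_s = 1 \otimes 1 \otimes \cdots \otimes \gamma^\mu \otimes 1 \otimes \cdots \quad (\gamma^\mu \text{ at the } s\text{-th site}), \]
\[ m_s = 1 \otimes 1 \otimes \cdots \otimes m_{(s)} \otimes 1 \otimes \cdots \quad (m_{(s)} \text{ at the } s\text{-th site}), \]

and $s$ labels the particle. Consider explicitly the two-particle case,

\[ \big[(i\gamma^\mu\partial_{\mu,1} - m_{(1)})\psi_1\big] \otimes \psi_2 + \psi_1 \otimes \big[(i\gamma^\mu\partial_{\mu,2} - m_{(2)})\psi_2\big] = 0. \tag{2.6} \]

Let $\phi_1$ and $\phi_2$ be arbitrary single particle wavefunctions for the two independent variables 1, 2.
Then multiplying by $\bar\phi_1 \otimes \bar\phi_2 / \big((\bar\phi_1\psi_1)(\bar\phi_2\psi_2)\big)$ on the left of (2.6), we have

\[ \frac{\bar\phi_1 (i\gamma^\mu\partial_{\mu,1} - m_{(1)})\psi_1}{\bar\phi_1\psi_1} + \frac{\bar\phi_2 (i\gamma^\mu\partial_{\mu,2} - m_{(2)})\psi_2}{\bar\phi_2\psi_2} = 0. \tag{2.7} \]

The result is clearly separable in that solutions of the individual single particle Dirac equations
satisfy the two-particle equation and vice-versa.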

Furthermore it is easy to show that if $\bar\psi$ for a many-body system is defined to be
$\bar\psi = \bar\psi_1 \otimes \bar\psi_2 \otimes \cdots \otimes \bar\psi_s \otimes \cdots$ then the two-particle adjoint equation that follows from (2.6) will have
the same form as the one-particle equation, and since form-invariance is in the spirit of the
universality criterion of Section 2.4 this justifies our definition.

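Now consider, as an example, the expression for the multi-particle current $j^\mu$. Multiply (2.6)
from the left by $\bar\psi_1 \otimes \bar\psi_2$; multiply the adjoint of (2.6) from the right by $\psi_1 \otimes \psi_2$, and take the
difference to get

\[ \big[\partial_{\mu,1} j^\mu_{(1)}\big] \otimes \bar\psi_2\psi_2 + \bar\psi_1\psi_1 \otimes \big[\partial_{\mu,2} j^\mu_{(2)}\big] = \sum_{s=1}^{2} \partial_{\mu,s} j^\mu_s = 0, \tag{2.8} \]

where $j^\mu_{(s)} = \bar\psi_s\gamma^\mu\psi_s$ is the single-particle current and the multi-particle current is defined to be

\[ j^\mu_s = \bar\psi_1\psi_1 \otimes \bar\psi_2\psi_2 \otimes \cdots \otimes \bar\psi_s\gamma^\mu\psi_s \otimes \bar\psi_{s+1}\psi_{s+1} \otimes \cdots \quad (\gamma^\mu \text{ at the } s\text{-th site}). \]

Multiplying (2.8) by $(\bar\psi_1\psi_1 \otimes \bar\psi_2\psi_2)^{-1}$ gives

\[ \frac{\partial_{\mu,1} j^\mu_{(1)}}{\bar\psi_1\psi_1} \otimes 1 + 1 \otimes \frac{\partial_{\mu,2} j^\mu_{(2)}}{\bar\psi_2\psi_2} = 0. \]

Thus conservation of individual currents implies the conservation of the two-particle current and
vice-versa.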
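
The product-state bookkeeping used above can be checked with Kronecker products. This is a minimal sketch under stated assumptions: the Dirac-Pauli representation, two arbitrary (hypothetical) single-particle spinors, and a generic matrix `A` standing in for $\gamma^\mu$ or any other single-particle operator.

```python
import numpy as np

g0 = np.diag([1, 1, -1, -1]).astype(complex)   # gamma^0, Dirac-Pauli representation
I4 = np.eye(4, dtype=complex)

def bar(psi):
    """Dirac adjoint psi^dagger gamma^0 as a row vector."""
    return psi.conj() @ g0

def ratio(psi_bar, A, psi):
    """(psibar A psi) / (psibar psi) for given row and column vectors."""
    return (psi_bar @ A @ psi) / (psi_bar @ psi)

psi1 = np.array([1 + 1j, 0.5, 0.2j, -0.1])     # arbitrary spinor, particle 1
psi2 = np.array([0.3, 2 - 1j, 0.1, 0.4j])      # arbitrary spinor, particle 2
A = np.arange(16).reshape(4, 4) + 1j * g0      # arbitrary single-particle matrix

# Two-particle product state and its adjoint, psibar = psibar_1 (x) psibar_2.
psi12 = np.kron(psi1, psi2)
bar12 = np.kron(bar(psi1), bar(psi2))

# Operator acting on each site in turn: A (x) 1 + 1 (x) A.
A12 = np.kron(A, I4) + np.kron(I4, A)

two_particle = ratio(bar12, A12, psi12)
sum_of_single = ratio(bar(psi1), A, psi1) + ratio(bar(psi2), A, psi2)
```

The denominator factorises as $(\bar\psi_1\psi_1)(\bar\psi_2\psi_2)$ and each numerator term carries the other particle's $\bar\psi\psi$, so `two_particle` equals `sum_of_single`; this is the same cancellation that turns (2.8) into a sum of single-particle statements.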

Similarly the definition

\[ \gamma_{5,s} = 1 \otimes 1 \otimes \cdots \otimes \gamma_5 \otimes 1 \otimes \cdots \quad (\gamma_5 \text{ at the } s\text{-th site}) \tag{2.9} \]

allows the multiparticle chiral current to be defined, and in the massless limit conservation of
individual chiral currents implies the conservation of the two-particle chiral current and
vice-versa.

2.5.1 Structure of F 

We would like our nonlinear equation to be separable in this minimal sense: for a wavefunction
which is the product of two independent states, the composite equation should decompose into
two independent equations^3. Looking at the expressions (2.6), (2.7) we see that for the nonlinear
equation (1.1) to be separable as such, we require $F$ to decompose as

\[ F(\psi_1 \otimes \psi_2) = F(\psi_1) \otimes 1 + 1 \otimes F(\psi_2) \]

for a state made up of two independent particles (constraints of this type have been studied
before for non-relativistic systems in [33]). Equation (2.5) and the examples above suggest that
this can be achieved if we have the structure

\[ F\!\left(\frac{\mathcal{N}}{\mathcal{D}}\right) \to \frac{\mathcal{N}_1}{\mathcal{D}_1} \otimes 1 + 1 \otimes \frac{\mathcal{N}_2}{\mathcal{D}_2}. \]

Thus for a product state we require $\mathcal{N} \to \mathcal{N}_1 \otimes \mathcal{D}_2 + \mathcal{D}_1 \otimes \mathcal{N}_2$ while $\mathcal{D} \to \mathcal{D}_1 \otimes \mathcal{D}_2$.

^3 We note that other implementations of separability might lead to more constraints, see for example [32].

Requiring $\mathcal{N}$ and $\mathcal{D}$ to be separately Poincaré invariant, we see that the only functional of $\psi$
that would decompose as required for $\mathcal{D}$ is $\bar\psi\psi$ and powers thereof. Thus our nonlinear term $F$
can be a sum of terms of the form

\[ F \sim \frac{\mathcal{N}(\bar\psi, \psi)}{(\bar\psi\psi)^n}, \tag{2.10} \]

subject to the other constraints that have yet to be imposed.

Our deduction of (2.10) has been somewhat heuristic and so the reader may prefer to think
of it as an ansatz within which we discuss our equations.

As mentioned earlier, the separability condition is appropriate for fundamental equations that
describe an arbitrary collection of particles. However if the nonlinearities are an approximate
description of an underlying dynamics, as effective equations attempt to do, then the universality
and separability arguments do not seem appropriate. Even then one may motivate the
structure (2.10) as follows. Generally, for slowly varying fields, one may perform a gradient
expansion for $F$ when seeking local equations,

\[ F = \sum_i \frac{\mathcal{N}_i}{\mathcal{D}_i}, \tag{2.11} \]

where the $\mathcal{N}_i$'s depend on the wavefunction and contain exactly $i$ derivatives. The $\mathcal{D}_i$'s also
depend on the wavefunction but do not contain any derivatives.

Now in most nonlinear Schrödinger or Dirac equations the nonlinear terms break the scale
invariance, $\psi \to \lambda\psi$, present in the linear theory. That is, typically the nonlinearities make the
equations sensitive to the amplitude of the fields, thus giving rise to very interesting phenomena.
However it is possible to have nonlinearities that preserve the scale invariance of the linear theory
and though the effects are then likely to be milder, they can still lead to novel and interesting
effects [33, 35]. So if we focus on such "soft" nonlinearities, and also impose Lorentz invariance,
then (2.11) is included in the form (2.10). Indeed, as we shall verify later, even without imposing
separability at the outset, separability of the resultant structures appears to be possible with
consistent definitions of the multi-particle states.

In summary, we will discuss in this paper the class of nonlinearities of the form (2.10) by
looking at several cases corresponding to a specific degree of nonlinearity, $n = 1, 2, \ldots$, and
a derivative expansion of the numerator.

We remark that the scale-invariant nonlinearities (2.10) we introduce here might also be
interesting for future quantum field theory investigations: these nonlinearities correspond to
Lagrangians that are still naively power-counting renormalisable.

2.6 Discrete symmetries

The Standard Model of particle physics encodes both parity and CP violation as these are
empirically observed facts. Thus in our nonlinear equation we find it interesting to allow violation
of individual symmetries. However in line with general theorems [27, 36] on local, Hermitian,
Lorentz covariant theories, we do find by explicit verification that our specific examples preserve
the combined PCT invariance although we do not impose it.

The discrete symmetry operators are the same as in the linear theory [27], and they place
constraints on the nonlinear term $F$ so that the nonlinear equation (1.1) is form invariant (similar
to the discussion in Section 2.2).

Ignoring unobservable phases, the representation independent parity operator is $\mathcal{P} = \gamma^0$ and
parity invariance requires

\[ \mathcal{P}^{-1} F_P \mathcal{P} = F, \]

where $F_P$ is the parity transformed $F$. Charge conjugation invariance is achieved if

\[ C^{-1} F_C C = F^{*}, \]

where $F_C$ is the charge conjugated $F$ and $C = i\gamma^2$ in the Dirac-Pauli representation. The
time-reversal invariance constraint on $F$ is

\[ T^{-1} F_T T = F^{*}, \]

$F_T$ being the time reversed $F$ and $T = i\gamma^1\gamma^3$ in the Dirac-Pauli representation^4.

Under the combined PCT transformation, $\Theta$, the nonlinear Dirac equation is invariant if

\[ \Theta^{-1} F_\Theta \Theta = F, \]

where $F_\Theta$ is the PCT transformed $F$. The representation independent form for $\Theta$ is proportional
to $\gamma_5$.

3 Explicit examples of nonlinear equations with $F \propto I$, $n = 1$

We found earlier in Section 2.5.1 that $F$ has the form

\[ F = \frac{\mathcal{N}(\bar\psi, \psi)}{(\bar\psi\psi)^n}, \tag{3.1} \]

where the number of factors of the wavefunction in the numerator is $2n$.

In the absence of other dynamical fields, Poincaré invariance requires spacetime indices of
matrices like $\gamma^\mu$ to be contracted among themselves or with derivatives $\partial_\mu$. We will assume that
the spinor indices of $\bar\psi$ and $\psi$ are contracted in the natural way with $\bar\psi$ acting like a row vector
and $\psi$ a column vector, for example $\mathcal{N} \sim A\,\bar\psi B\psi\,C$ where $A$, $B$, $C$ are matrices in spinor space.

In this Section we restrict the explicit discussion to the important case where $F$ is proportional
to the identity matrix $I$ in spinor space,

\[ F = f I, \tag{3.2} \]

and so the nonlinearity $f$ may be thought of as a spacetime dependent mass. This choice is
motivated by our interest in neutrino oscillations. We also consider here only the lowest order
of nonlinearity, $n = 1$. In Section 6 we discuss some other types of $F$.

Current conservation for the case (3.2) simply amounts to the statement that $f$ is a real
function of the wavefunction,

\[ f = f^{*}. \]

3.1 No derivatives 

In the absence of derivatives, the most general structure of the nonlinear term with $F \propto I$ and
$n = 1$ is given by

\[ F = \frac{\bar\psi A \psi}{\bar\psi\psi}, \tag{3.3} \]

where $A$ is a matrix. In the absence of other fields which carry spacetime indices we must
therefore have

\[ A = aI + ib\gamma_5, \]

where $a$, $b$ are constants. The $a$ term is clearly equivalent to a mass term in the linear equation
and so may be ignored in the following discussion. Notice that the form $A = ib\gamma_5$ in (3.3),
which is a consequence of Lorentz invariance, also automatically satisfies the PCT invariance
condition.

^4 We remind the reader that we are treating the wavefunction as a classical object rather than as a Grassmann
variable. Thus the denominator of (2.10) obtains a negative sign when performing a transpose operation, such as
occurs in charge conjugation. Such negative signs mutually cancel for our scale-invariant nonlinearities (2.10).

As for individual discrete symmetries, using the equations of Section 2.6 we see that the
term with $b \neq 0$ preserves C invariance but breaks parity. Time-reversal invariance requires $b$
to be purely imaginary, which conflicts with the requirement from current conservation that
$b$ be real.

We thus conclude that our simplest nonlinear equation, with $F \propto I$ and $n = 1$,

\[ F = i\epsilon\,\frac{\bar\psi\gamma_5\psi}{\bar\psi\psi}, \]

unavoidably breaks P and CP, something that is surely intriguing from the perspective of particle
physics phenomenology. We have indicated the small nonlinearity parameter by $\epsilon$.

Note that the multiparticle version of the above equation is separable, so that does not impose
additional constraints. Nonlinear Dirac equations without derivatives in the nonlinear part have
been studied in [22, 23] and (3.3) is a special case of the equations studied there.
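
The properties claimed for this simplest nonlinearity can be spot-checked numerically. The sketch below assumes the Dirac-Pauli representation, in which $\gamma_5$ is block off-diagonal, and uses an arbitrary (hypothetical) test spinor and parameter value; parity is implemented only through its action $\psi \to \gamma^0\psi$ on the spinor at a point.

```python
import numpy as np

g0 = np.kron(np.diag([1, -1]), np.eye(2)).astype(complex)            # gamma^0
g5 = np.kron(np.array([[0, 1], [1, 0]]), np.eye(2)).astype(complex)  # gamma_5 (Dirac-Pauli)

def bar(psi):
    """Dirac adjoint psi^dagger gamma^0 as a row vector."""
    return psi.conj() @ g0

def f_simplest(psi, eps):
    """f = i eps (psibar gamma_5 psi) / (psibar psi), the n = 1 term without derivatives."""
    return 1j * eps * (bar(psi) @ g5 @ psi) / (bar(psi) @ psi)

eps = 0.05                                     # hypothetical nonlinearity parameter
psi = np.array([1 + 1j, 0.5, 0.2j, -0.1])      # arbitrary test spinor
lam = 2.5 - 0.7j                               # arbitrary complex rescaling

f_value = f_simplest(psi, eps)
f_scaled = f_simplest(lam * psi, eps)
f_parity = f_simplest(g0 @ psi, eps)           # spinor part of the parity transformation
```

In this sketch `f_value` is real, as current conservation requires for real $\epsilon$; it is unchanged by the rescaling, and it flips sign under $\psi \to \gamma^0\psi$, which is the parity violation noted above.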

3.1.1 Lorentz vs PCT invariance

Let us discuss the situation whereby the PCT invariance is imposed on (3.3) first. Then we
find, using $\Theta \propto \gamma_5$, that we require

\[ [A, \gamma_5] = 0, \]

which is satisfied if $A$ has the form

\[ A = aI + b\gamma_5 + c^{\mu\nu}\sigma_{\mu\nu}. \]

If there are no other dynamical fields other than the wavefunction, then $c^{\mu\nu}$ can only be a
constant background field, thus explicitly breaking Lorentz invariance. Indeed, explicitly
implementing Lorentz invariance of (3.3) gives

\[ S^{-1}(\Lambda)\, A\, S(\Lambda) = A, \]

which for the infinitesimal case gives $[A, \sigma_{\alpha\beta}] = 0$. This only allows

\[ A = aI + b\gamma_5 \]

as we argued earlier.

In other words, we can have PCT invariance even if we give up Lorentz invariance, which
again is consistent with general results in the literature [27, 36].
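
The two commutator conditions can be compared directly. This is a minimal sketch assuming the Dirac-Pauli representation and $\sigma^{\mu\nu} = \tfrac{i}{2}[\gamma^\mu,\gamma^\nu]$; the numerical coefficients and the choice of a single background component along $\sigma^{01}$ are hypothetical.

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gam = [np.kron(np.diag([1, -1]), np.eye(2)).astype(complex)]
gam += [np.kron(np.array([[0, 1], [-1, 0]]), s) for s in pauli]
g5 = 1j * gam[0] @ gam[1] @ gam[2] @ gam[3]
I4 = np.eye(4, dtype=complex)

def comm(X, Y):
    """Matrix commutator [X, Y]."""
    return X @ Y - Y @ X

# All independent sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu].
sigma = {(m, n): 0.5j * comm(gam[m], gam[n]) for m in range(4) for n in range(m + 1, 4)}

a, b, c01 = 0.7, 0.2 - 0.1j, 0.4                 # hypothetical constants
A_lorentz = a * I4 + b * g5                      # allowed by Lorentz invariance
A_background = A_lorentz + c01 * sigma[(0, 1)]   # adds a constant background component

pct_ok = np.allclose(comm(A_background, g5), 0)
lorentz_ok = all(np.allclose(comm(A_lorentz, s), 0) for s in sigma.values())
lorentz_broken = max(np.abs(comm(A_background, s)).max() for s in sigma.values())
```

Both matrices commute with $\gamma_5$, but only `A_lorentz` commutes with every $\sigma^{\mu\nu}$; the background term fails, for instance against $\sigma^{12}$.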

3.2 One derivative 

The most general form of $F$ is now given by the linear combination of the following two terms,

\[ \frac{(\partial_\mu\bar\psi)\,A\gamma^\mu B\,\psi}{\bar\psi\psi}, \qquad \frac{\bar\psi\,C\gamma^\mu D\,\partial_\mu\psi}{\bar\psi\psi}. \]

As in the no derivative case, Lorentz covariance requires that both $A$, $B$ be proportional to
a linear combination of $I$, $\gamma_5$, and likewise for $C$, $D$, and so we may write

\[ F = \frac{(\partial_\mu\bar\psi)(aI + ib\gamma_5)\gamma^\mu\psi + \bar\psi(cI - id\gamma_5)\gamma^\mu\partial_\mu\psi}{\bar\psi\psi}, \]

a result which also satisfies PCT invariance. Hermiticity of this $F$, and hence current
conservation, is satisfied if we have $c = a^{*}$ and $d = b^{*}$. Clearly parity invariance is violated if $b \neq 0$; in
that case C invariance requires $b$ to be purely imaginary while T invariance requires $b$ to be real.
The constant $a$ is not constrained by parity but both C and T invariance separately require $a$
to be purely imaginary.

Let us consider the special case where each of the discrete symmetries is individually
preserved: $b = 0$ and $a = i\epsilon$ with $\epsilon$ a real parameter that controls the strength of the nonlinearity.
Then we may write, using explicitly the on-shell current conservation condition,

\[ F = 2i\epsilon\,\frac{(\partial_\mu\bar\psi)\gamma^\mu\psi}{\bar\psi\psi} = -2i\epsilon\,\frac{\bar\psi\gamma^\mu\partial_\mu\psi}{\bar\psi\psi}. \tag{3.4} \]

For $\epsilon$ small, one may simplify $F$ in (3.4) by solving the nonlinear Dirac equation (1.1)
iteratively. To leading order $(i\gamma^\mu\partial_\mu - m)\psi = 0$, which when used in $F$ gives $F = -2\epsilon m$. Thus to
leading order in $\epsilon$ the nonlinearity (3.4) is just a mass shift.
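
The leading-order mass shift can be confirmed on an explicit plane wave of the linear theory. The following is a minimal sketch assuming the Dirac-Pauli representation; the mass, momentum, value of $\epsilon$ and the seed spinor `w` are hypothetical. The on-shell spinor is built as $u = (\gamma^\mu k_\mu + m)w$, and the plane-wave phase drops out of the ratio.

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gam = [np.kron(np.diag([1, -1]), np.eye(2)).astype(complex)]
gam += [np.kron(np.array([[0, 1], [-1, 0]]), s) for s in pauli]

m, eps = 1.3, 1e-3                       # hypothetical mass and nonlinearity parameter
k_vec = np.array([0.4, -0.2, 0.7])       # hypothetical spatial momentum
E = np.sqrt(m**2 + k_vec @ k_vec)        # on-shell energy of the linear theory

# k-slash = gamma^mu k_mu = E gamma^0 - k . gamma
k_slash = E * gam[0] - sum(k_vec[i] * gam[i + 1] for i in range(3))

# Positive-energy solution of the linear equation: (k-slash - m) u = 0.
w = np.array([1, 0, 0, 0], dtype=complex)
u = (k_slash + m * np.eye(4)) @ w
u_bar = u.conj() @ gam[0]

# For psi = exp(-i k.x) u:  d_mu psi = -i k_mu psi  and  d_mu psibar = +i k_mu psibar.
F_right = -2j * eps * (u_bar @ (-1j * k_slash) @ u) / (u_bar @ u)
F_left = 2j * eps * (u_bar @ (1j * k_slash) @ u) / (u_bar @ u)
```

Both forms of (3.4) evaluate to $-2\epsilon m$ on this state, independently of the momentum chosen.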

We remark that just as in the no derivative case, we could have imposed PCT invariance
first and obtained cases which violate Lorentz covariance. However we defer further discussion
of Lorentz violating cases to a later section.



3.3 Two derivatives 

There are well-known problems in constructing Lorentz covariant higher-derivative
first-quantised theories. Consider a normalised state,

\[ 1 = \int d^3x\, \psi^\dagger\psi. \]

Applying $\partial_t$ to both sides gives

\[ 0 = \int d^3x\, \big(\dot\psi^\dagger\psi + \psi^\dagger\dot\psi\big). \]

Now, if the evolution is second-order in time, then one can specify $\psi(0,\mathbf{x})$ and $\dot\psi(0,\mathbf{x})$
independently and that would mean that the right-hand-side of the above equation need not be zero in
general, leading to a contradiction.

However, in our nonlinear equations, Hermiticity and hence current conservation are ensured
by construction and so the above-mentioned problem does not occur. This of course does not
guarantee that all other physical quantities will be well-behaved, but it is plausible that that is
the case if the higher-order terms are treated perturbatively.

The general structure of the two-derivative nonlinear term, $F \propto I$, without embedded $\gamma$
matrices is

\[ F = \frac{a\,(\partial_\mu\partial^\mu\bar\psi)\,\psi + b\,\bar\psi\,\partial_\mu\partial^\mu\psi + c\,(\partial_\mu\bar\psi)(\partial^\mu\psi)}{\bar\psi\psi}. \]

Each numerator/denominator term is separately Poincaré and PCT invariant. However while
each term is also separately parity invariant, C or T invariance requires all the coefficients $a$, $b$, $c$
to be real.

Current conservation, $F = F^\dagger$, implies that $b = a^{*}$ and $c = c^{*}$. Thus we conclude that for $a$
not real, both C and T (or CP) are violated.



Nonlinear Dirac Equations 






4 Plane-wave solutions and dispersion relations 

We wish to construct plane-wave solutions to the nonlinear equations of the previous section. 
As in the case for the linear theory, we require the solutions to be simultaneous eigenstates of 
momentum and energy. Let us clarify what this means in the nonlinear theory. 

Although we allow the equations to be nonlinear, we keep the fundamental commutation
relation between the position and momentum operators, $[x, p] = i\hbar$. Thus in the Schrödinger
representation we have $\hat{\mathbf p} = -i\hbar\nabla$ and the momentum eigenvalue equation is $\hat{\mathbf p}\psi_p = \mathbf{p}\psi_p$. Likewise
the energy-eigenvalue equation is given by $i\hbar\partial_t\psi_E = E\psi_E$.

With Lorentz covariance preserved, the method to find plane-wave solutions is similar to the
linear case. We seek solutions of the form^5

\[ \psi(x,t) = e^{-ik\cdot x}\,u(k), \tag{4.1} \]

with $k$ a four vector.

The dispersion relations will be covariantly modified from that of the linear theory. Consider
the nonlinear Dirac equation,

\[ i\partial_t\psi = \big[-i\boldsymbol{\alpha}\cdot\nabla + \beta m - \beta F\big]\psi, \tag{4.2} \]

for the case $F = fI$, where $\alpha^i = \gamma^0\gamma^i$. Substituting the plane wave ansatz into the above
equation, squaring this and re-arranging gives

\[ k^2 u = [m - f(u)]^2 u, \tag{4.3} \]

where $k^2 = k_\mu k^\mu$. Thus we have

\[ k^2 = [m - f(k^2)]^2. \tag{4.4} \]

(Since equation (4.3) is covariant, $f$ must also be covariant.)

The solution of (4.4) requires the explicit form for $f$, the nonlinear term. It may also require
the explicit form for the plane wave solutions which we discuss next. Note that from the above
expression, one may view the effect of the nonlinearity for plane wave states as giving rise to an
effective mass.

Assume $m \neq 0$. Then in the rest frame we have from (4.1), (4.2),

\[ M u = [\beta m - \beta F(u)]\,u, \tag{4.5} \]

where the rest energy has been labelled by $M > 0$.

For the case $F \propto I$, the rest frame Hamiltonian is therefore proportional to $\gamma^0 = \beta$ and the
eigenstates are as in the linear theory [27]. These can then be boosted as usual to obtain the
general solutions. The net result is similar to the usual spinor solutions of the linear theory but
with the effective mass $M$ in place of the bare mass $m$,

\[ E^2 = \mathbf{k}^2 + M^2. \]

The expression for $M$ in terms of $m$ and the nonlinear parameters can be determined by
substituting the rest frame spinors into (4.5).



^We have set /i = 1. 
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4.1 Perturbative method 

The procedure of boosting rest frame solutions is valid if Lorentz invariance is a symmetry of
the theory. If we relax the constraint of Lorentz invariance^, we will not be able to use this
method to find the energy dispersion relations. We therefore introduce a method that yields
the energy dispersion relation, to leading order in the nonlinearity, even if we do not know the
exact plane wave solutions of the theory.

From (4.3), we have

$$E^2 = k^2 + m^2 - 2mf + f^2.$$

Since the nonlinear term will contain a small nonlinearity parameter $\epsilon$, we can explicitly factor
it out. That is, $f = \epsilon\tilde f$. Then to leading order, we have

$$E^2 = k^2 + m^2 - 2\epsilon m\tilde f.$$

Now we assume the following,

$$k^\mu = k^{(0)\mu} + O(\epsilon), \qquad u(k) = u^{(0)}(k^{(0)}) + O(\epsilon),$$

where $k^{(0)}$ and $u^{(0)}(k^{(0)})$ are the usual 4-momentum and $u$'s for the linear theory. Thus to
leading order in $\epsilon$ we have

$$E^2 = (k^2 + m^2) - 2\epsilon m\tilde f\big(u^{(0)}(k^{(0)})\big) + O(\epsilon^2) = (k^2 + m^2) - 2\epsilon m\tilde f\big(u^{(0)}(k)\big) + O(\epsilon^2). \tag{4.6}$$

Note that in the last step, we have replaced $k^{(0)}$ by $k$. This is permissible because we are dropping
terms that are of order $\epsilon^2$ or higher.

The perturbative method allows us to find corrections to the linear theory's energy dispersion
relation. We only need to substitute linear plane wave solutions into the nonlinear term. Note
that the above method works only for the massive theory. If we consider the massless limit then
we might need to keep terms that are of order $\epsilon^2$.
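As a quick numerical illustration of the truncation above, the following Python sketch compares $E^2 = k^2 + m^2 - 2mf + f^2$ with its leading-order form $k^2 + m^2 - 2\epsilon m\tilde f$, where $f = \epsilon\tilde f$. It assumes, purely for illustration, that $\tilde f$ evaluated on the linear plane wave is a known number; the function names and sample values are hypothetical and do not come from the paper.

```python
def energy_squared_full(k, m, eps, f_tilde):
    """E^2 = k^2 + m^2 - 2 m f + f^2 with f = eps * f_tilde (f_tilde assumed given)."""
    f = eps * f_tilde
    return k**2 + m**2 - 2.0 * m * f + f**2


def energy_squared_leading(k, m, eps, f_tilde):
    """Leading-order form: the O(eps^2) term f^2 is dropped."""
    return k**2 + m**2 - 2.0 * eps * m * f_tilde


def truncation_error(k, m, eps, f_tilde):
    """Difference between the full and the leading-order expressions."""
    return energy_squared_full(k, m, eps, f_tilde) - energy_squared_leading(k, m, eps, f_tilde)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to show the eps^2 scaling.
    for eps in (1e-2, 5e-3, 2.5e-3):
        print(eps, truncation_error(k=1.0, m=1.0, eps=eps, f_tilde=2.0))
```

The difference between the two expressions is $\epsilon^2\tilde f^2$, so it shrinks by a factor of four each time $\epsilon$ is halved. For $m = 0$ the $O(\epsilon)$ correction vanishes identically, which is why the massless case may require the $\epsilon^2$ term.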



4.2 Example 

We look at an explicit example corresponding to $F \propto I$ and $n = 1$ with two derivatives, and
obtain the corresponding expression for the effective mass $M$ for plane wave states. Although
one can work covariantly with the expressions (4.4), it is faster to work in the rest frame, that
is, by using (4.5).

Consider the nonlinear term when each of the discrete symmetries is preserved,

$$F = \frac{a\,\partial^\mu\partial_\mu(\bar\psi\psi) + (c - 2a)\,(\partial_\mu\bar\psi)(\partial^\mu\psi)}{\bar\psi\psi}.$$

Substituting the plane wave solution, the first term drops out leaving

$$F = (c - 2a)M^2 = \epsilon M^2.$$

Thus

$$M^2 = (m - \epsilon M^2)^2. \tag{4.7}$$
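The plane wave substitution behind the factor $\epsilon M^2$ in (4.7) can be spelled out as follows. This is only a sketch: it assumes a positive energy plane wave $\psi = u(k)\,e^{-ik\cdot x}$ with $k^2 = M^2$, and an $F$ whose numerator is built from $\partial^\mu\partial_\mu(\bar\psi\psi)$ and $(\partial_\mu\bar\psi)(\partial^\mu\psi)$ with $\bar\psi\psi$ in the denominator. The normalisation of $u$ drops out.

```latex
\begin{align*}
\bar\psi\psi &= \bar u u \quad (\text{independent of } x)
  \;\;\Rightarrow\;\; \partial^\mu\partial_\mu(\bar\psi\psi) = 0, \\
(\partial_\mu\bar\psi)(\partial^\mu\psi)
  &= (i k_\mu)(-i k^\mu)\,\bar u u = k^2\,\bar\psi\psi = M^2\,\bar\psi\psi, \\
F &= \frac{(c-2a)\,M^2\,\bar\psi\psi}{\bar\psi\psi} = (c-2a)\,M^2 .
\end{align*}
```

With $\epsilon = c - 2a$, the nonlinear term evaluated on the plane wave depends on $M$ itself, which is why the resulting condition on $M$ is of higher degree than in the linear theory.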



^Here we refer to violation of particle Lorentz invariance while keeping observer Lorentz invariance. This can
be done by introducing background fields, see Section 5.

Taking the square root and solving we get

$$M = \frac{\mp 1 \pm \sqrt{1 + 4\epsilon m}}{2\epsilon}. \tag{4.8}$$
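The four roots in (4.8) can be checked numerically against (4.7). The Python sketch below evaluates $M = (s_1 + s_2\sqrt{1 + 4\epsilon m})/(2\epsilon)$ for the four sign choices $s_1, s_2 = \pm 1$ and confirms that each satisfies $M^2 = (m - \epsilon M^2)^2$. The function names and the sample values $m = 1$, $\epsilon = 0.01$ are hypothetical, chosen only for illustration.

```python
import math


def rest_energy_roots(m, eps):
    """Return the four roots M = (s1 + s2*sqrt(1 + 4*eps*m)) / (2*eps), s1, s2 = +/-1."""
    disc = 1.0 + 4.0 * eps * m
    if disc < 0.0:
        raise ValueError("1 + 4*eps*m must be non-negative for real roots")
    root = math.sqrt(disc)
    return [(s1 + s2 * root) / (2.0 * eps) for s1 in (-1.0, 1.0) for s2 in (1.0, -1.0)]


def residual(M, m, eps):
    """Residual of M^2 = (m - eps*M^2)^2; zero for an exact root."""
    return M**2 - (m - eps * M**2) ** 2


# Hypothetical sample values.
roots = rest_energy_roots(m=1.0, eps=0.01)
positive = sorted(M for M in roots if M > 0)

if __name__ == "__main__":
    for M in roots:
        print(f"M = {M:12.6f}, residual = {residual(M, 1.0, 0.01):.2e}")
    print("positive roots:", positive)
```

For these sample values two of the four roots are positive: one lies close to $m$, the other is of order $1/\epsilon$.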



For the rest energy to be real, we need $\epsilon > -\frac{1}{4m}$. Let us consider the case where $\epsilon > 0$. Then
since we have taken $M > 0$ by convention, only the following two of the four solutions in (4.8)
are physical:

$$M = \frac{\mp 1 + \sqrt{1 + 4\epsilon m}}{2\epsilon}.$$

In the limit $\epsilon \ll 1$, we have

$$M = m - \epsilon m^2 + O(\epsilon^2) \qquad \text{or} \qquad M = \frac{1}{\epsilon} + m + O(\epsilon). \tag{4.9}$$

There are therefore two legitimate positive energy solutions for $0 < \epsilon \ll 1$. This is because
the equation (4.7) is a quartic equation instead of the usual quadratic which arises when only
first-order derivatives appear in the Dirac equation. The first possibility in (4.9) represents
a perturbation to the usual rest mass and is seen also in the direct perturbative approach
of (4.6). It results in the dispersion relation

$$E^2 = k^2 + m^2 - 2\epsilon m^3 + O(\epsilon^2).$$

The other solution in (4.9) represents a non-perturbative mass generation that exists even
when $m \to 0$.

5 Lorentz violating nonlinear equations

There are various ways of motivating the study of Lorentz violating theories. For example, at
short distances space might not be smooth and so dynamical equations might require higher
spatial derivatives to adequately describe the situation. However if one still restricts the time
derivatives to first or second order, to avoid potential causality problems, then clearly one has
to give up on Lorentz covariance.

We will consider nonlinear terms $F$ which simultaneously violate Lorentz invariance [18].
The Lorentz violation will be implemented via constant background fields: in the terminology
of [37, 38] our equations will preserve the observer Lorentz covariance but break the particle
Lorentz symmetry, which involves boosting the particles and local fields but not background
fields [37, 38].

In this part of the paper we illustrate some of the possibilities rather than work out all cases
as this becomes tedious and is better left for specific applications.

As Lorentz violation is constrained by phenomenology to be small [39, 40], we may use
perturbative methods to determine the corrected dispersion relations.

5.1 An example: no derivatives

If the Lorentz violation is described by background vector fields, then for $F \propto I$ and $n = 1$ we
may write

(5.1)


where the background vector fields are constants; current conservation requires them to be real.

Under a PCT transformation of the spinors alone in (5.1) we have $F \to -F$. Thus we have
here our first example of PCT violation associated with Lorentz violation. However it is possible
to maintain PCT while still violating Lorentz covariance. Consider

^2 — ^ h IDafj ^ , 

where $A_{\alpha\beta}$ and $B_{\alpha\beta}$ are real background tensor fields. Both current conservation and PCT
invariance are satisfied in this case.

The dispersion relation for perturbed plane waves can be obtained using the perturbative
method, equation (4.6). For example, for the case

$$F = A_\mu\,\frac{\bar\psi\gamma^\mu\psi}{\bar\psi\psi}$$

we get $F = A\cdot k/m$. Thus $E^2 = k^2 + m^2 - 2A\cdot k$. Notice the correction is $O(k)$.
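The value of $F$ on a linear plane wave follows from a standard identity for free Dirac spinors. The short sketch below assumes the nonlinearity has the form $F = A_\mu\bar\psi\gamma^\mu\psi/\bar\psi\psi$ and uses the Gordon identity at equal momenta for positive energy solutions $u(k)$ of the linear equation; the background field $A_\mu$ plays the role of the small parameter.

```latex
\begin{align*}
\bar u(k)\,\gamma^\mu u(k) &= \frac{k^\mu}{m}\,\bar u(k)\,u(k)
  \quad\Rightarrow\quad
  F\Big|_{\text{plane wave}} = \frac{A_\mu k^\mu}{m} = \frac{A\cdot k}{m}, \\
E^2 &= k^2 + m^2 - 2m\,\frac{A\cdot k}{m} + O(A^2)
     = k^2 + m^2 - 2A\cdot k + O(A^2).
\end{align*}
```

The factor of $m$ cancels, so at this order the correction to the dispersion relation is linear in the momentum and independent of the mass.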



6 Other cases 

In this section we look at some other examples of nonlinear equations within the class (3.1), such
as those with higher nonlinearities, $n \geq 2$, or with $F \propto \gamma^\mu$. We also discuss the Lagrangian
approach and some examples of nonlinearities outside the class (3.1).



6.1 Lorentz invariant equation with $F \propto I$, $n = 2$

For simplicity we consider here only cases where there are no derivatives in F. An example is 
given by 

It is Poincaré invariant and invariant under each of the discrete symmetries, while Hermiticity
requires $\epsilon$ to be real. It is easy to verify, using the definition from Section 2.3, that $F$ is separable.



6.2 Lorentz violating equation with $F \propto \gamma^\mu$, $n = 1$

Here we consider an $F$ that is proportional to $\gamma^\mu$. Such terms will allow the chiral current to be
conserved, as discussed in Section 2.3.2. If we exclude derivatives then the simplest possibility
is to let the Lorentz index of the gamma matrix contract with that of the background field $A_\mu$,



iA^y 



tptp 



Hermiticity requires the background field to be real. This $F$ individually breaks all the discrete
symmetries and is PCT odd. It is separable.



6.3 Equations from a Lagrangian 

There are both advantages and disadvantages in using a Lagrangian approach. Firstly, a local
equation does not necessarily have a local Lagrangian. Also, even though a Lagrangian might
be simple, the resultant equations of motion might look complicated. On the other hand,
it is probably easier to discuss conservation laws corresponding to symmetries starting from
a Lagrangian. Another possible advantage of a Lagrangian approach will appear after we look
at an example.

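Consider the Lagrangian density

$$\mathcal{L} = \bar\psi\,(i\gamma^\mu\partial_\mu - m)\,\psi + \mathcal{L}_{NL}.$$

Suppose, for simplicity, $\mathcal{L}_{NL}$ contains no derivatives. Then the equation of motion will reduce
to

$$i\gamma^\mu\partial_\mu\psi - m\psi + \frac{\partial\mathcal{L}_{NL}}{\partial\bar\psi} = 0,$$

which is similar to (1.1) and so we label the last nonlinear term here as $F_{\rm e.o.m.}\psi$. As an
example, using

$$\mathcal{L}_{NL} = \frac{(\bar\psi A\psi)(\bar\psi B\psi)}{\bar\psi\psi}$$

gives

$$F_{\rm e.o.m.}\psi = \frac{\partial\mathcal{L}_{NL}}{\partial\bar\psi} = \frac{A\psi\,(\bar\psi B\psi) + B\psi\,(\bar\psi A\psi)}{\bar\psi\psi} - \frac{(\bar\psi A\psi)(\bar\psi B\psi)}{(\bar\psi\psi)^2}\,\psi.$$

Thus we see that an $n = 1$ nonlinearity in the Lagrangian will introduce a mixture of $n = 1, 2$
terms into the equations of motion. This then might be one advantage of the Lagrangian
approach: it generates constrained complexity from simplicity.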



7 Gauge inequivalence 

It is possible to generate a nonlinear equation from the linear Dirac equation through a nonlinear 
gauge transformation [21]. The transformed equation is equivalent to the original equation in 
the sense that the probability density is an invariant. Here we show that the nonlinear terms 
we have investigated in this paper cannot be obtained by performing a gauge transformation on 
the linear Dirac equation, and so represent genuine and distinct nonlinear structures. 
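We define the following gauge transformation,

$$\psi \to \psi' = e^{i\theta(x)}\psi,$$

where $\theta(x)$ is a function of $\psi$'s and $\bar\psi$'s. In general, we will treat $\theta(x)$ as a $4 \times 4$ matrix^. We
require the probability to be invariant under the gauge transformation,

$$\psi'^\dagger\psi' = \psi^\dagger\psi,$$

and so $\theta^\dagger = \theta$.

Under an infinitesimal gauge transformation of the linear Dirac equation we get

$$(1 - i\theta)\,(i\gamma^\mu\partial_\mu - m)\,(1 + i\theta)\,\psi \approx 0,$$

$$(i\gamma^\mu\partial_\mu - m)\,\psi + [\theta, \gamma^\mu]\,\partial_\mu\psi - \gamma^\mu(\partial_\mu\theta)\,\psi \approx 0. \tag{7.1}$$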
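The passage to (7.1) is a first-order expansion in $\theta$. Spelled out as a sketch, keeping only terms linear in $\theta$ and allowing $\theta$ to be matrix valued (so that it need not commute with $\gamma^\mu$):

```latex
\begin{align*}
(1-i\theta)(i\gamma^\mu\partial_\mu - m)(1+i\theta)\psi
 &\simeq (i\gamma^\mu\partial_\mu - m)\psi
   + (i\gamma^\mu\partial_\mu - m)(i\theta\psi)
   - i\theta\,(i\gamma^\mu\partial_\mu - m)\psi \\
 &= (i\gamma^\mu\partial_\mu - m)\psi
   - \gamma^\mu(\partial_\mu\theta)\psi
   - \gamma^\mu\theta\,\partial_\mu\psi
   + \theta\gamma^\mu\partial_\mu\psi \\
 &= (i\gamma^\mu\partial_\mu - m)\psi
   + [\theta,\gamma^\mu]\,\partial_\mu\psi
   - \gamma^\mu(\partial_\mu\theta)\psi .
\end{align*}
```

The two mass terms, $-im\theta\psi$ and $+im\theta\psi$, cancel at this order, so the mass enters only through the linear operator.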

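We wish to identify the $\theta$ dependent terms with the nonlinearity $F$ in our nonlinear Dirac
equation (1.1), so we set

$$F\psi = [\theta, \gamma^\mu]\,\partial_\mu\psi - \gamma^\mu(\partial_\mu\theta)\,\psi.$$

^For ease of notation, we will often suppress the $x$-dependence in $\theta$ and $\psi$.

Thus

$$\bar\psi F\psi = \bar\psi\,[\theta, \gamma^\mu]\,\partial_\mu\psi - \bar\psi\,\gamma^\mu(\partial_\mu\theta)\,\psi. \tag{7.2}$$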

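We note that equation (7.2) is not symmetric in $\bar\psi$ and $\psi$, and so this representation of $F$ is not
Hermitian. In order to obtain a symmetric equation, we will repeat the above steps on the
adjoint Dirac equation (this also removes any ambiguity when taking the adjoint of $\partial_\mu$).

For the adjoint equation, we have for an infinitesimal gauge transformation,

$$0 = \bar\psi\,(i\gamma^\mu\overleftarrow{\partial}_\mu + m) + (\partial_\mu\bar\psi)\left[\gamma^0\theta^\dagger\gamma^0\gamma^\mu - \gamma^\mu\theta\right] + \bar\psi\,\gamma^0(\partial_\mu\theta^\dagger)\gamma^0\gamma^\mu + im\bar\psi\left(\theta - \gamma^0\theta^\dagger\gamma^0\right). \tag{7.3}$$

Now the adjoint of (1.1) is, upon using the Hermiticity constraint (2.1),

$$\bar\psi\,(i\gamma^\mu\overleftarrow{\partial}_\mu + m) - \bar\psi F = 0.$$

Thus comparing with (7.3) we label

$$\bar\psi F = -(\partial_\mu\bar\psi)\left[\gamma^0\theta^\dagger\gamma^0\gamma^\mu - \gamma^\mu\theta\right] - \bar\psi\,\gamma^0(\partial_\mu\theta^\dagger)\gamma^0\gamma^\mu - im\bar\psi\left(\theta - \gamma^0\theta^\dagger\gamma^0\right). \tag{7.4}$$

Multiplying (7.4) by $\psi$ from the right and adding to (7.2) gives

$$2\bar\psi F\psi = \bar\psi\,[\theta, \gamma^\mu]\,\partial_\mu\psi - (\partial_\mu\bar\psi)\left[\gamma^0\theta^\dagger\gamma^0\gamma^\mu - \gamma^\mu\theta\right]\psi - \bar\psi\left[\partial_\mu\left(\gamma^\mu\theta + \gamma^0\theta^\dagger\gamma^0\gamma^\mu\right)\right]\psi - im\bar\psi\left(\theta - \gamma^0\theta^\dagger\gamma^0\right)\psi. \tag{7.5}$$

The left-hand-side is Hermitian if the constraint (2.1) on $F$ is applied. But the adjoint of the
right-hand-side is

$$(\partial_\mu\bar\psi)\left(\gamma^\mu\gamma^0\theta^\dagger\gamma^0 - \gamma^0\theta^\dagger\gamma^0\gamma^\mu\right)\psi - \bar\psi\left(\gamma^\mu\theta - \gamma^0\theta^\dagger\gamma^0\gamma^\mu\right)\partial_\mu\psi - \bar\psi\left[\partial_\mu\left(\gamma^0\theta^\dagger\gamma^0\gamma^\mu + \gamma^\mu\theta\right)\right]\psi + im\bar\psi\left(\gamma^0\theta^\dagger\gamma^0 - \theta\right)\psi. \tag{7.6}$$

Comparing (7.5) and (7.6), we require

$$\gamma^0\theta^\dagger\gamma^0 = \theta \quad\Rightarrow\quad [\theta, \gamma^0] = 0.$$

Then (7.5) becomes

$$2\bar\psi F\psi = \bar\psi\,[\theta, \gamma^\mu]\,\partial_\mu\psi - (\partial_\mu\bar\psi)\,[\theta, \gamma^\mu]\,\psi - \bar\psi\left[\partial_\mu\{\theta, \gamma^\mu\}\right]\psi. \tag{7.7}$$

So far we have deduced two constraints on $\theta$,

$$\theta = \theta^\dagger, \qquad [\theta, \gamma^0] = 0,$$

coming respectively from the invariance of the probability density and Hermiticity. These are
necessary constraints for a nonlinear equation generated by gauge transformation to be equivalent
to a theory of our general class, but one must still check if any candidate solution, $\theta$, is
actually a solution, that is, sufficiency is not guaranteed by (7.7).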

7.1 Lorentz invariant case 

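We will now look at the constraint from Poincaré invariance. Recall that we need $S^{-1}F'S = F$
under $\psi \to \psi' = S\psi$. The l.h.s. of (7.7) is clearly invariant while the r.h.s. transforms into

$$\bar\psi\,S^{-1}[\theta', \gamma^\mu]\,\Lambda_\mu{}^\nu S\,\partial_\nu\psi - (\partial_\nu\bar\psi)\,S^{-1}[\theta', \gamma^\mu]\,\Lambda_\mu{}^\nu S\,\psi - \bar\psi\left[\partial_\nu\,S^{-1}\Lambda_\mu{}^\nu\{\theta', \gamma^\mu\}S\right]\psi. \tag{7.8}$$

Comparing (7.8) with (7.7), we get

$$S^{-1}[\theta', \gamma^\mu]\,S\,\Lambda_\mu{}^\nu = [\theta, \gamma^\nu], \qquad S^{-1}\{\theta', \gamma^\mu\}\,S\,\Lambda_\mu{}^\nu = \{\theta, \gamma^\nu\}.$$

Thus we have the constraint

$$S^{-1}\theta' S = \theta,$$

which for an infinitesimal Lorentz transformation gives

$$\theta' - \frac{i}{4}\,\omega^{ab}\,[\theta', \sigma_{ab}] = \theta.$$

Therefore in total we have 3 constraints,

constraint 1: $\theta^\dagger = \theta$,
constraint 2: $[\theta, \gamma^0] = 0$,
constraint 3: $\theta' - \frac{i}{4}\,\omega^{ab}\,[\theta', \sigma_{ab}] = \theta$.

From constraint 2, $\theta$ must be proportional to $I$ or $\gamma^0$. If $\theta \propto I$, then all constraints are satisfied,
but for $\theta \propto \gamma^0$ we cannot satisfy constraint 3. Let $\theta = g\gamma^0$, where $g$ is a scalar function of the
wavefunctions. Then the Poincaré transformed $\theta$ is given by $\theta' = g\gamma^0$. Substituting this into
the left-hand-side of constraint 3, we get $g\gamma^0$ plus a term proportional to $\omega^{ab}$.
Since that term is non-zero, the result is not proportional to $\gamma^0$ and so $\theta \propto \gamma^0$ does not satisfy
constraint 3. Thus we conclude that $\theta$ can only be proportional to $I$.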
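The failure of constraint 3 for $\theta \propto \gamma^0$ can be made explicit with the standard commutator $[\gamma^\mu, \sigma^{\alpha\beta}] = 2i\,(g^{\mu\alpha}\gamma^\beta - g^{\mu\beta}\gamma^\alpha)$, where $\sigma^{\alpha\beta} = \frac{i}{2}[\gamma^\alpha, \gamma^\beta]$. The following is a sketch under the assumptions $\theta' = g\gamma^0$ and an antisymmetric infinitesimal parameter $\omega_{ab}$; the overall sign of the extra term depends on the convention adopted for $S$.

```latex
\begin{align*}
\theta' - \frac{i}{4}\,\omega_{ab}\,[\theta', \sigma^{ab}]
 &= g\gamma^0 - \frac{i}{4}\,g\,\omega_{ab}\cdot
    2i\left(g^{0a}\gamma^b - g^{0b}\gamma^a\right) \\
 &= g\gamma^0 + \frac{g}{2}\left(\omega^{0}{}_{b}\,\gamma^b - \omega_{a}{}^{0}\,\gamma^a\right)
  = g\left(\gamma^0 + \omega^{0}{}_{b}\,\gamma^b\right).
\end{align*}
```

Because $\omega^{00} = 0$, only the boost parameters $\omega^{0}{}_{i}$ contribute. The extra term $g\,\omega^{0}{}_{i}\gamma^i$ involves the spatial gamma matrices and so cannot be absorbed into a multiple of $\gamma^0$ unless the boost vanishes.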
Hence with $\theta \propto I$, equation (7.7) becomes

$$\bar\psi F\psi = -\tfrac{1}{2}\,\bar\psi\left[\partial_\mu\{\theta, \gamma^\mu\}\right]\psi = -\bar\psi\gamma^\mu\psi\,\partial_\mu\theta = -j^\mu\partial_\mu\theta. \tag{7.9}$$

Consider the specific case where $F$ is proportional to $I$. Writing $F = fI$, we deduce from (7.9)
that

$$f = -\frac{j^\mu\partial_\mu\theta}{\bar\psi\psi}. \tag{7.10}$$

Remember that $\theta$ is a function of $\psi$'s and $\bar\psi$'s, and recall our condition (2.4): we see therefore
that $\theta$ must be invariant under a scaling of the wavefunction. As long as the nonlinearities
cannot be expressed in the form shown in (7.10), we can be sure that they cannot be obtained
by performing a gauge transformation on the linear Dirac equation. In particular we conclude
that the Lorentz covariant nonlinear Dirac equations we have explicitly studied in this paper
are not gauge equivalent to the linear Dirac equation.

Now consider the class of nonlinearities where $F$ is proportional to $\gamma^\mu$. We let $F = f_\mu\gamma^\mu$,
where $f_\mu$ are functions of $\psi$'s and $\bar\psi$'s. Then (7.9) becomes

$$f_\mu\,\bar\psi\gamma^\mu\psi = f_\mu j^\mu = -j^\mu\partial_\mu\theta. \tag{7.11}$$

Therefore if $f_\mu$ cannot be expressed as a total derivative of a scale-invariant $\theta$ function as in (7.11),
then those nonlinear structures proportional to $\gamma^\mu$ cannot be obtained from the linear Dirac
equation by a gauge transformation. In particular the cases we considered in Section 6.2 are
safe.

7.2 Lorentz violating cases

Finally let us consider the case where $F$ is Lorentz violating. We have constructed our Lorentz
violating terms by introducing a constant background field (independent of the wavefunction).
We may write $F$ as $A_\mu G^\mu$ where $G^\mu$ is the nonlinear factor which may be proportional
to $I$, $\gamma^\mu$ etc.

Could the Lorentz violating examples we have considered be obtained by a nonlinear gauge
transformation of the linear Dirac equation with or without Lorentz violation? The linear Dirac
equation to start with would now be of the form

$$(i\gamma^\mu\partial_\mu - m)\,\psi + LV\,\psi = 0,$$

where $LV$ is a state-independent Lorentz violating term, if it is not zero (we assume that $LV$
does not have free derivatives that act to the right on $\psi$). Gauge transforming this equation with
a state-dependent but Hermitian $\theta \propto I$ can generate at most Lorentz covariant nonlinearities. So
consider the other possibility, $\theta \propto \gamma^0$. Then one would generate Lorentz violating nonlinearities
and on the right-hand-side of (7.1) there would be an additional term proportional to $[LV, \gamma^0]$. Now if we
write $\theta = g\gamma^0$, (7.7) becomes

$$\bar\psi F\psi = g\,\bar\psi\gamma^0\gamma^i\partial_i\psi - g\,(\partial_i\bar\psi)\,\gamma^0\gamma^i\psi - [\partial_0 g]\,\bar\psi\psi. \tag{7.12}$$

The first observation is that in order to write the right-hand-side in covariant form we need to
introduce background tensor (for the first two terms) and vector (for the last term) fields. Also
from the structural form of our $F$ (3.1), we see by comparing both sides of (7.12) that $\theta$ must
be invariant under scaling of the wavefunction. The examples we have explicitly discussed in
this paper therefore do not fall under the category of nonlinearities described by (7.12). For
example, with $F = fI$, (7.12) becomes

$$f = \frac{g\,\bar\psi\gamma^0\gamma^i\partial_i\psi - g\,(\partial_i\bar\psi)\,\gamma^0\gamma^i\psi}{\bar\psi\psi} - \partial_0 g, \tag{7.13}$$

which means having at least $n = 2$ and a simultaneous use of tensor and vector fields: these
are necessary conditions for the nonlinearity to be obtained through a Lorentz violating gauge
transformation of the usual linear Dirac equation.
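For reference, the reduction of (7.7) when $\theta = g\gamma^0$ uses only the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$. A short sketch, with $i$ a spatial index and $g$ a scalar function of the wavefunctions:

```latex
\begin{gather*}
[\gamma^0, \gamma^0] = 0, \qquad
[\gamma^0, \gamma^i] = 2\gamma^0\gamma^i, \qquad
\{\gamma^0, \gamma^\mu\} = 2g^{0\mu}, \\
2\bar\psi F\psi
 = 2g\,\bar\psi\gamma^0\gamma^i\partial_i\psi
 - 2g\,(\partial_i\bar\psi)\,\gamma^0\gamma^i\psi
 - 2(\partial_0 g)\,\bar\psi\psi .
\end{gather*}
```

Dividing by two gives the structure of (7.12). The commutator terms contain only spatial derivatives and the anticommutator term only a time derivative, so the expression singles out a preferred frame, in line with the need for background fields noted above.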



8 Discussion 

In [18] it was suggested that fundamental quantum nonlinearities might be related to potential
Lorentz violation [37, 38, 39, 40]. This current paper is a step towards a quantitative study of the
suggestions in [18]. We have discussed a framework for systematically constructing nonlinear 
Dirac equations, at the quantum mechanical level, that satisfy other conventional properties
such as Hermiticity, Poincaré invariance and $\lambda\psi$ (scaling) invariance although, as shown, even those
can be relaxed.

We gave several examples of such equations, different in structure from those studied previously
in the literature, and discussed their properties. We also demonstrated that our equations
were not gauge equivalent to the linear Dirac equation. More explicit examples of our class of
nonlinear Dirac equations may be found in [41] and their non-relativistic limit is studied in [42].

As mentioned in Section 1, one application of such equations is to study neutrino oscillations [13],
which would be an ideal probe of quantum nonlinearities, with or without a simultaneous
Lorentz violation [18]. Other examples we hope to study with the nonlinear equations are CP
violation and dark matter/energy. In this regard, it would be useful to obtain non-plane-wave
solutions to our nonlinear equations, similar to what has been done for simpler polynomial-type
nonlinear Dirac equations in [22, 23].

A number of authors have argued that nonlinear quantum evolution of states within the standard
kinematical framework of quantum theory would lead to pathologies. However, on closer
examination, such attempts at "no go" theorems were seen to require one or more assumptions
that are not very obvious on physical grounds; for detailed critiques and citations to the
literature the interested reader is referred to [44, 45].

We have kept open the possibility that the nonlinearities we proposed might be fundamental, 
effective or only phenomenological. Of course there is less contention if the nonlinearities are 
only an approximate representation of more complex underlying dynamics; in any case, from 
a Wilsonian perspective, one deals in physics with a sequence of approximate theories. 

Effective or phenomenological nonlinear equations are quite common in the non-relativistic
domain [6, 7] and there are also a few examples of phenomenological relativistic nonlinear
equations [Hi [TOl 111]. As another possibility of the latter case, we note that some condensed matter
systems have (linear) relativistic-looking equations for their quasi-particles [46]: these are surely
approximations to nonlinear equations.